Abstract. We consider the problem of finding elliptical shapes in an image
and discuss a solution based on cross-entropy clustering. The proposed
method allows searching for ellipses of predefined size and position in
space. Moreover, it works well when searching for ellipsoids in higher
dimensions.
1 Introduction

Ellipse detection is one of the most important problems in image processing.
It has been studied using a wide variety of methods, see e.g. Tsuji and
Matsumoto [14] and Davies [3]. Most of the existing techniques use the Hough
Transform [7], which is very memory and time consuming.

In this paper we present a new approach and discuss its advantages and
disadvantages. We show the results of the algorithm on the pictures from
Fig. 1. The algorithm discussed in this paper:

- is easily adaptable, i.e. if we know the expected shape of the object
sought, or its position (orientation) in space, then with little calculation
we can prepare a proper configuration for its detection;

- can simultaneously detect multiple types of objects, e.g. we can look for
matches and coins at the same time;

- is rather insensitive to disturbances of the picture (such as blurring,
contrast and illumination modification, etc.);

- can be used for classification (we can detect specified shapes) and for
clustering (we can use it for exploring the data structure).

Fig. 1: The result of our algorithm: Fig. 1a - original image, Fig. 1b -
binarized image, the input for our algorithm, Fig. 1c - outcome from
algorithm, clusters marked in different colors.

The acceptable disadvantage of the presented method is that, to work well,
it requires prior knowledge that the picture we study contains no objects
other than ellipse-like shapes. Consequently, our approach is well-adapted
for example to the following tasks:

- count the number of ellipses in the picture;

- divide the shapes into circles of different radii;

- count the number of vertical and horizontal ellipses.

Our idea uses cross-entropy clustering [13] (CEC), which from the practical
point of view can be seen as a combination of the k-means method with the
model approach used in expectation maximization (EM). EM [9, 1, 10] is one
of the basic and most important applications of maximum likelihood in
density estimation [8]. EM, or its variations like classification EM [12],
are often applied in clustering. Although the EM approach is quite general
and gives good results, to apply it we usually need to first perform
complicated computations. Moreover, to accomplish the M step one commonly
needs computationally expensive minimization techniques, and consequently
EM is relatively slow and cannot deal well with large data.

Our aim in this paper is to show that CEC is well-adapted to classification
and detection of ellipses and ellipsoids. The advantage of CEC over EM is
simplicity and speed: in the case of typical Gaussian families we do not
need the M-step, which enables us in particular to use the fast and
efficient Hartigan's approach. Moreover, since using each cluster in CEC has
its cost, CEC, contrary to classification EM, removes on-line those clusters
which carry no information. In practice this implies that our algorithm can
find the "right" number of ellipses in the picture.

Let us discuss the contents of the paper. In the first part of our work we
briefly describe the CEC algorithm. In the next section we present the basic
models we use (compare with [4]). We also present results of numerical
experiments. Then we describe the procedure for finding toothpicks in the
image (see Fig. 1).

In the Appendix we provide the proof of the only cross-entropy formula in
this paper which is essentially new. In our opinion its proof is worth
including, as it in fact gives a method which can easily be used to find the
cross-entropy for other Gaussian subfamilies.
2 Theoretical background of CEC

In this section we give a short introduction to CEC; for a more detailed
explanation we refer the reader to [13]. To explain CEC we need to introduce
the "energy function" we want to minimize. By the cross-entropy of the
probability measure $\mu$ (which represents the data set we study) with
respect to a density $f$ we understand

$$H^{\times}(\mu\|f) = -\int \ln f(y)\, d\mu(y).$$

The above cross-entropy corresponds to the theoretical code-length of
compression of a $\mu$-randomly chosen element of $\mathbb{R}^N$ with the
code optimized for the density $f$ [2]. In the more general case when one is
interested in the (best) coding for $\mu$ by densities chosen from a family
$\mathcal{F}$, we arrive at the cross-entropy of $\mu$ with respect to a
family of coding densities $\mathcal{F}$:

$$H^{\times}(\mu\|\mathcal{F}) := \inf_{f \in \mathcal{F}} H^{\times}(\mu\|f).$$

In the case of splitting $\mathbb{R}^N$ into pairwise disjoint sets
$U_1, \ldots, U_n$ such that elements of $U_i$ are "coded" by the optimal
density from the family $\mathcal{F}_i$, the mean code-length of a randomly
chosen element $x$ equals

$$E(U_1,\mathcal{F}_1;\ldots;U_n,\mathcal{F}_n) := \sum_{i=1}^{n} \mu(U_i) \cdot \left(-\ln(\mu(U_i)) + H^{\times}(\mu_{U_i}\|\mathcal{F}_i)\right), \qquad (1)$$

where $\mu_U$ denotes the normalized restriction of $\mu$ to the set $U$ and
is given by $\mu_U(A) := \frac{1}{\mu(U)}\mu(A \cap U)$.
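The energy (1) can be evaluated directly for a given partition. The following is a minimal Python sketch, assuming 2D data, the family of all Gaussians for every cluster, and the closed form from the last row of Table 1. The function names and the `(sxx, sxy, syy)` covariance representation are illustrative choices, not taken from the paper.

```python
import math

def mean_and_cov(points):
    """Mean and (biased, 1/n) covariance (sxx, sxy, syy) of 2D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    return (mx, my), (sxx, sxy, syy)

def cross_entropy_all_gaussians(cov):
    """(N/2) ln(2 pi e) + (1/2) ln det(Sigma_mu) for N = 2."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy  # must be > 0 (non-degenerate cluster)
    return math.log(2 * math.pi * math.e) + 0.5 * math.log(det)

def cec_energy(clusters):
    """Energy (1): sum over clusters of p_i * (-ln p_i + H(mu_i || G))."""
    total = sum(len(c) for c in clusters)
    energy = 0.0
    for c in clusters:
        p = len(c) / total  # mu(U_i) for the empirical measure
        _, cov = mean_and_cov(c)
        energy += p * (-math.log(p) + cross_entropy_all_gaussians(cov))
    return energy
```

For two well-separated groups of points, `cec_energy([a, b])` is smaller than `cec_energy([a + b])`, which is the sense in which the energy prefers the split that matches the data.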

The aim of CEC is to find a splitting of $\mathbb{R}^N$ into pairwise
disjoint sets $U_i$ which minimizes the function given in (1). In this paper
we restrict ourselves, for the sake of simplicity, to clusters generated by
Gaussian densities (although one can easily use any density family for which
MLE can be performed).

Now we proceed with a discussion of the Gaussian models we will use in CEC.
We consider the following density families:

1. $\mathcal{G}_{\Sigma}$ - Gaussian densities with covariance $\Sigma$. The
clustering will have the tendency to divide the data into clusters
resembling the unit circles in the Mahalanobis distance given by
$\|x - y\|_{\Sigma}^2 := (x - y)^T \Sigma^{-1} (x - y)$. An important
special case is the subfamily $\mathcal{G}_{rI}$, where $r > 0$ is fixed (in
this case we will have the tendency to divide the data into "circles" with
approximate radius $\sqrt{r}$).

2. $\mathcal{G}_{(\cdot I)}$ - spherical Gaussian densities, whose
covariance is proportional to the identity. The clustering will try to
divide the data into circles of arbitrary sizes.

3. $\mathcal{G}_{\mathrm{diag}}$ - Gaussians with diagonal covariance. The
clustering will try to divide the data into ellipsoids with axes parallel to
the coordinate axes.

4. $\mathcal{G}$ - all Gaussian densities. In this case we divide the data
set into ellipsoid-like clusters without any preferences concerning the
size, shape or position in space of the ellipsoid.
We need a result which gives the cross-entropy of the probability measure
$\mu$ with respect to coding adapted to the respective Gaussian subfamilies.
A basic role is played by the following observation.

Observation 2.1. Let $\mu$ be a discrete or continuous probability measure
in $\mathbb{R}^N$ with well-defined mean $m_\mu := \int x \, d\mu(x)$ and
covariance matrix
$\Sigma_\mu := \int (x - m_\mu)(x - m_\mu)^T d\mu(x)$. Let a fixed
positive-definite symmetric matrix $\Sigma$ be given.

Then $H^{\times}(\mu\|\mathcal{G}_{\Sigma}) = H^{\times}(\mu_G\|\mathcal{N}(m_\mu, \Sigma))$,
where $\mu_G$ denotes the probability measure with Gaussian density of the
same mean and covariance as $\mu$. Consequently

$$H^{\times}(\mu\|\mathcal{G}_{\Sigma}) = \frac{N}{2}\ln(2\pi) + \frac{1}{2}\mathrm{tr}(\Sigma^{-1}\Sigma_\mu) + \frac{1}{2}\ln\det(\Sigma). \qquad (2)$$

By applying the above observation one can easily deduce the formulas for
cross-entropy given in Table 1 (see footnote 3).
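As an illustration of how such formulas follow from (2), the derivation below works out the rows for $\mathcal{G}_{rI}$ and $\mathcal{G}_{(\cdot I)}$. It is a sketch that only substitutes $\Sigma = rI$ into (2) and then minimizes over $r$; no step beyond elementary calculus is assumed.

```latex
\begin{align*}
H^{\times}(\mu\|\mathcal{G}_{rI})
  &= \tfrac{N}{2}\ln(2\pi) + \tfrac{1}{2r}\operatorname{tr}(\Sigma_\mu) + \tfrac{N}{2}\ln r, \\
\frac{d}{dr} H^{\times}(\mu\|\mathcal{G}_{rI})
  &= -\tfrac{1}{2r^{2}}\operatorname{tr}(\Sigma_\mu) + \tfrac{N}{2r} = 0
  \iff r = \operatorname{tr}(\Sigma_\mu)/N, \\
H^{\times}(\mu\|\mathcal{G}_{(\cdot I)})
  &= \tfrac{N}{2}\ln(2\pi) + \tfrac{N}{2} + \tfrac{N}{2}\ln\frac{\operatorname{tr}(\Sigma_\mu)}{N}
   = \tfrac{N}{2}\ln(2\pi e/N) + \tfrac{N}{2}\ln(\operatorname{tr}\Sigma_\mu).
\end{align*}
```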



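$\mathcal{F}$                    cov. matrix                       $H^{\times}(\mu\|\mathcal{F})$
-------------------------------------------------------------------------------------------------
$\mathcal{G}_{\Sigma}$           $\Sigma$                          $\frac{N}{2}\ln(2\pi) + \frac{1}{2}\mathrm{tr}(\Sigma^{-1}\Sigma_\mu) + \frac{1}{2}\ln\det(\Sigma)$
$\mathcal{G}_{rI}$               $rI$                              $\frac{N}{2}\ln(2\pi) + \frac{1}{2r}\mathrm{tr}(\Sigma_\mu) + \frac{N}{2}\ln r$
$\mathcal{G}_{(\cdot I)}$        $\frac{\mathrm{tr}(\Sigma_\mu)}{N} I$   $\frac{N}{2}\ln(2\pi e/N) + \frac{N}{2}\ln(\mathrm{tr}\,\Sigma_\mu)$
$\mathcal{G}_{\mathrm{diag}}$    $\mathrm{diag}(\Sigma_\mu)$       $\frac{N}{2}\ln(2\pi e) + \frac{1}{2}\ln(\det(\mathrm{diag}(\Sigma_\mu)))$
$\mathcal{G}$                    $\Sigma_\mu$                      $\frac{N}{2}\ln(2\pi e) + \frac{1}{2}\ln\det(\Sigma_\mu)$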



Table 1: Table of cross-entropy formulas with respect to Gaussian subfamilies. 



In the second column we give the formula for the covariance matrix of the
Gaussian density which realizes the desired minimum of cross-entropy
(obviously the mean is always the mean of the measure). Simple applications
of the formulas given above can be found in Fig. 2.
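The formulas of Table 1 translate directly into code. Below is a minimal Python sketch restricted to the plane ($N = 2$), with a symmetric matrix stored as a tuple `(sxx, sxy, syy)`. The function names `h_fixed`, `h_rI`, `h_spherical`, `h_diag` and `h_all` are hypothetical labels for the five rows, not names used in the paper.

```python
import math

LN_2PI = math.log(2 * math.pi)      # (N/2) ln(2 pi) with N = 2
LN_2PIE = math.log(2 * math.pi * math.e)

def h_fixed(cov_mu, sigma):
    """Row G_Sigma: ln(2 pi) + tr(Sigma^-1 Sigma_mu)/2 + ln det(Sigma)/2."""
    a, b, c = sigma
    d = a * c - b * b                       # det(Sigma), assumed > 0
    inv = (c / d, -b / d, a / d)            # Sigma^-1 as (xx, xy, yy)
    sxx, sxy, syy = cov_mu
    tr = inv[0] * sxx + 2 * inv[1] * sxy + inv[2] * syy
    return LN_2PI + 0.5 * tr + 0.5 * math.log(d)

def h_rI(cov_mu, r):
    """Row G_rI: ln(2 pi) + tr(Sigma_mu)/(2 r) + ln r."""
    sxx, _, syy = cov_mu
    return LN_2PI + (sxx + syy) / (2 * r) + math.log(r)

def h_spherical(cov_mu):
    """Row G_(.I): ln(2 pi e / 2) + ln tr(Sigma_mu)."""
    sxx, _, syy = cov_mu
    return math.log(2 * math.pi * math.e / 2) + math.log(sxx + syy)

def h_diag(cov_mu):
    """Row G_diag: ln(2 pi e) + ln(det diag(Sigma_mu))/2."""
    sxx, _, syy = cov_mu
    return LN_2PIE + 0.5 * math.log(sxx * syy)

def h_all(cov_mu):
    """Row G: ln(2 pi e) + ln det(Sigma_mu)/2."""
    sxx, sxy, syy = cov_mu
    return LN_2PIE + 0.5 * math.log(sxx * syy - sxy * sxy)
```

Since the families are nested, the values are ordered: for any covariance, `h_all` is at most `h_diag`, which is at most `h_spherical`, which is at most `h_rI` for every `r`. This gives a quick sanity check of an implementation.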





Fig. 2: The simplest case: input and outcome for our algorithm applied to
$\mathcal{G}_{rI}$ (Fig. 2a and 2b) and $\mathcal{G}_{\mathrm{diag}}$
(Fig. 2c and 2d).



Footnote 3: In practice all the formulas given in Table 1 are known, see for
example [13].
3 Case study

Let us explain the method on the following simple problem: assume that we
want to count the toothpicks in Fig. 3. To do so we take a particular object
and compute its covariance matrix. We have obtained a covariance with
eigenvalues

$\lambda_1 = 4938.5$ and $\lambda_2 = 5.7$.

Since we want to allow the toothpick to have any position in space, we
introduce the set $\mathcal{G}_{\lambda_1,\lambda_2}$ consisting of all
Gaussian densities on the plane with covariance matrix having eigenvalues
$\lambda_1$ and $\lambda_2$ (observe that this set is rotation and
translation invariant, but not scale invariant).

Fig. 3: The result of our algorithm: Fig. 3a - original image, Fig. 3b -
binarized image, the input for our algorithm, Fig. 3c - outcome from
algorithm, clusters marked in different colors, Fig. 3d - outcome from
algorithm, ellipses with the same mean and covariance as the densities
calculated by the algorithm.

Consider now a probability measure $\mu$, representing our data, with
covariance $\Sigma_\mu$ with eigenvalues
$\lambda_1^\mu \geq \lambda_2^\mu > 0$. By applying Proposition 1 (see
Appendix) jointly with Observation 2.1 we easily conclude that the best
approximation (understood in the maximum likelihood or, equivalently,
cross-entropy sense) of $\mu$ in $\mathcal{G}_{\lambda_1,\lambda_2}$ is
given by the Gaussian density whose covariance matrix has the same
eigenvectors as $\Sigma_\mu$ and eigenvalues $\lambda_1$ and $\lambda_2$.
Consequently, the cross-entropy, which plays the role of energy, is by (2)
given by

$$H^{\times}(\mu\|\mathcal{G}_{\lambda_1,\lambda_2}) = \frac{N}{2}\ln(2\pi) + \frac{1}{2}\left(\lambda_1^\mu/\lambda_1 + \lambda_2^\mu/\lambda_2\right) + \frac{1}{2}\left(\ln(\lambda_1) + \ln(\lambda_2)\right).$$
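The energy for the family with prescribed eigenvalues needs only the eigenvalues of the cluster covariance. The following is a minimal Python sketch for the plane ($N = 2$), using the closed-form eigenvalues of a symmetric 2x2 matrix stored as `(sxx, sxy, syy)`. The helper names are illustrative assumptions; the template values 4938.5 and 5.7 are the ones reported above.

```python
import math

def eigenvalues_2x2(cov):
    """Eigenvalues (largest first) of a symmetric 2x2 matrix (sxx, sxy, syy)."""
    sxx, sxy, syy = cov
    half_trace = 0.5 * (sxx + syy)
    radius = math.hypot(0.5 * (sxx - syy), sxy)
    return half_trace + radius, half_trace - radius

def energy_fixed_eigenvalues(cov_mu, lam1, lam2):
    """Cross-entropy of a cluster w.r.t. Gaussians with eigenvalues lam1 >= lam2."""
    mu1, mu2 = eigenvalues_2x2(cov_mu)  # mu1 >= mu2, paired with lam1 >= lam2
    return (math.log(2 * math.pi)
            + 0.5 * (mu1 / lam1 + mu2 / lam2)
            + 0.5 * (math.log(lam1) + math.log(lam2)))

def rotated_cov(l1, l2, theta):
    """Covariance with eigenvalues l1, l2 whose l1-axis is at angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return (l1 * c * c + l2 * s * s, (l1 - l2) * c * s, l1 * s * s + l2 * c * c)

LAM1, LAM2 = 4938.5, 5.7  # toothpick template from the case study
```

A cluster shaped like the template has the same energy at any orientation, while a round cluster with the same total variance is penalized heavily, which is what lets the method single out toothpick-like shapes.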

By applying Hartigan's approach we can now find the splitting of the data
into pairwise disjoint sets $U_1, \ldots, U_k$ which minimizes the value of
(1). Results of our method can be seen in Fig. 3 (we omit here the natural
preliminary binarization procedure).

To visualize the clusters found, we draw the boundary of an ellipse with the
same mean and covariance as a given density estimator (see footnote 4).
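The drawing step can be sketched as follows, assuming the relation recalled in footnote 4: a uniform density on an ellipse with radii $r_1, r_2$ has covariance eigenvalues $r_i^2/4$, so the radii are $2\sqrt{\lambda_i}$. This is a minimal Python sketch for the plane; the function names and the `(sxx, sxy, syy)` representation are illustrative assumptions.

```python
import math

def ellipse_axes(cov):
    """Radii (major, minor) and major-axis angle for covariance (sxx, sxy, syy)."""
    sxx, sxy, syy = cov
    half_trace = 0.5 * (sxx + syy)
    radius = math.hypot(0.5 * (sxx - syy), sxy)
    lam_major = half_trace + radius
    lam_minor = max(half_trace - radius, 0.0)  # guard against rounding below 0
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return 2.0 * math.sqrt(lam_major), 2.0 * math.sqrt(lam_minor), angle

def ellipse_boundary(mean, cov, steps=64):
    """Points on the boundary of the ellipse matching the given mean and covariance."""
    a, b, angle = ellipse_axes(cov)
    c, s = math.cos(angle), math.sin(angle)
    points = []
    for k in range(steps):
        t = 2.0 * math.pi * k / steps
        x, y = a * math.cos(t), b * math.sin(t)
        # rotate by the major-axis angle and shift to the cluster mean
        points.append((mean[0] + c * x - s * y, mean[1] + s * x + c * y))
    return points
```

The returned points can be passed to any plotting routine to overlay the ellipse on the clustered image.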

4 Conclusion 

We have proposed a new method, based on the cross-entropy clustering
approach, for the classification and detection of ellipse-like shapes. The
main advantage of the method lies in the fact that it can be easily adapted
to finding ellipses of desired shape and position in space. The basic
disadvantage is that in the current algorithm configuration (basic approach)
we can deal only with pictures which contain only ellipse-like shapes (for
example, we cannot discover ellipses in a picture with ellipses and
rectangles). Our further work will consist in eliminating this
inconvenience.

5 Appendix: how to compute MLE for Gaussian families 

The situation is very simple if we search for the MLE, or in other words for
the minimum in (2), in the class of diagonal matrices (the subclass
consisting of Gaussians with independent variables). A more demanding
question is how to find the desired minimum in the class of all Gaussians.
Below we present an approach which makes this possible.

We will use the well-known von Neumann trace inequality [6, 11]:

Theorem [von Neumann trace inequality]. Let $E, F$ be complex $N \times N$
matrices. Then

$$|\mathrm{tr}(EF)| \leq \sum_{i=1}^{N} s_i(E) \cdot s_i(F), \qquad (3)$$

where $s_i(D)$ denote the singular values of the matrix $D$ in decreasing
order.

Let us recall that for a symmetric nonnegative matrix the eigenvalues
coincide with the singular values.

Given $\lambda_1, \ldots, \lambda_N \in \mathbb{R}$, by
$S_{\lambda_1,\ldots,\lambda_N}$ we denote the set of all symmetric matrices
with eigenvalues $\lambda_1, \ldots, \lambda_N$. The following proposition
plays the basic role in the search for optimal Gaussian densities, as it
reduces the search over all symmetric matrices to a search in the set of
eigenvalues. Since its proof is short, we provide it for the sake of
completeness.

Footnote 4: We recall that the covariance matrix of a uniform density on an
ellipse with radii $r_1, r_2$ is given by $[r_1^2/4, 0; 0, r_2^2/4]$, that
is, we draw the ellipse with radii $2\sqrt{\lambda_i}$.
Proposition 1. Let $B$ be a symmetric nonnegative matrix with eigenvalues
$\beta_1 \geq \ldots \geq \beta_N \geq 0$. Let
$0 \leq \lambda_1 \leq \ldots \leq \lambda_N$ be fixed. Then

$$\min_{A \in S_{\lambda_1,\ldots,\lambda_N}} \mathrm{tr}(AB) = \sum_{i} \lambda_i \beta_i.$$

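Proof. Let $e_i$ denote the orthonormal basis built from the eigenvectors of
$B$ (so that $B e_i = \beta_i e_i$) and let the operator $\bar{A}$ be
defined in this basis by $\bar{A}(e_i) = \lambda_i e_i$. Then trivially

$$\min_{A \in S_{\lambda_1,\ldots,\lambda_N}} \mathrm{tr}(AB) \leq \mathrm{tr}(\bar{A}B) = \sum_{i} \lambda_i \beta_i.$$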

To prove the reverse inequality we will use the von Neumann trace
inequality. Let $A \in S_{\lambda_1,\ldots,\lambda_N}$ be arbitrary. We
apply the inequality (3) for $E = \lambda_N I - A$, $F = B$. Since $E$ and
$F$ are symmetric nonnegative matrices, their eigenvalues
$\lambda_N - \lambda_i$ and $\beta_i$ coincide with the singular values, and
therefore by (3)

$$\mathrm{tr}((\lambda_N I - A)B) \leq \sum_{i} (\lambda_N - \lambda_i)\beta_i = \lambda_N \sum_{i} \beta_i - \sum_{i} \lambda_i \beta_i. \qquad (4)$$

Since $\mathrm{tr}((\lambda_N I - A)B) = \lambda_N \sum_{i} \beta_i - \mathrm{tr}(AB)$,
from inequality (4) we obtain that
$\mathrm{tr}(AB) \geq \sum_{i} \lambda_i \beta_i$.
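Proposition 1 can be checked numerically in the plane, where every symmetric matrix with given eigenvalues is a rotation of a diagonal one. The sketch below is only an illustration under that 2D assumption: it scans rotation angles for $A$ and compares the smallest value of $\mathrm{tr}(AB)$ with $\sum_i \lambda_i \beta_i$. The function names and the sample eigenvalues are hypothetical.

```python
import math

def sym_from_eigen(l1, l2, theta):
    """Symmetric 2x2 matrix (xx, xy, yy); the l1-eigenvector is at angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return (l1 * c * c + l2 * s * s, (l1 - l2) * c * s, l1 * s * s + l2 * c * c)

def trace_product(a, b):
    """tr(AB) for symmetric 2x2 matrices stored as (xx, xy, yy)."""
    return a[0] * b[0] + 2.0 * a[1] * b[1] + a[2] * b[2]

def min_trace_over_rotations(lam1, lam2, b, steps=1000):
    """Smallest tr(AB) over a grid of matrices A with eigenvalues lam1, lam2."""
    return min(
        trace_product(sym_from_eigen(lam1, lam2, k * math.pi / steps), b)
        for k in range(steps)
    )

# B has eigenvalues beta_1 = 5 >= beta_2 = 2; A has lambda_1 = 1 <= lambda_2 = 3.
B = sym_from_eigen(5.0, 2.0, 0.3)
bound = 1.0 * 5.0 + 3.0 * 2.0  # sum of lambda_i * beta_i
grid_min = min_trace_over_rotations(1.0, 3.0, B)
```

Here `grid_min` stays above `bound` and approaches it when $A$ shares the eigenvectors of $B$, in agreement with the proposition.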

Corollary 1. Assume that we want to find the best fit of $\mu$ with
covariance $\Sigma_\mu$ in the class
$\mathcal{G}_{\lambda_1,\ldots,\lambda_n}$, where
$\lambda_1 \geq \ldots \geq \lambda_n > 0$.

To do so we take the eigenvalues
$\lambda_1^\mu \geq \ldots \geq \lambda_n^\mu$ of $\Sigma_\mu$ corresponding
to orthonormal eigenvectors $e_1^\mu, \ldots, e_n^\mu$, and then $\Sigma$ is
given in the basis $(e_i^\mu)$ as a diagonal matrix with
$\lambda_1, \ldots, \lambda_n$ on the diagonal.